Yu-Wing Tai

Tencent

CoT-Seg: Rethinking Segmentation with Chain-of-Thought Reasoning and Self-Correction

Jan 24, 2026
Via arXiv

ContextAnyone: Context-Aware Diffusion for Character-Consistent Text-to-Video Generation

Dec 08, 2025
Via arXiv

SmartAvatar: Text- and Image-Guided Human Avatar Generation with VLM AI Agents

Jun 05, 2025
Via arXiv

MA-RAG: Multi-Agent Retrieval-Augmented Generation via Collaborative Chain-of-Thought Reasoning

May 26, 2025
Via arXiv

Agentic 3D Scene Generation with Spatially Contextualized VLMs

May 26, 2025
Via arXiv

ThinkVideo: High-Quality Reasoning Video Segmentation with Chain of Thoughts

May 24, 2025
Via arXiv

FusionSegReID: Advancing Person Re-Identification with Multimodal Retrieval and Precise Segmentation

Mar 27, 2025
Via arXiv

Multimodal Generation of Animatable 3D Human Models with AvatarForge

Mar 11, 2025
Via arXiv

ReelWave: A Multi-Agent Framework Toward Professional Movie Sound Generation

Mar 10, 2025
Via arXiv

Dynamic Path Navigation for Motion Agents with LLM Reasoning

Mar 10, 2025
Via arXiv